fix(backend): resolve five backend bugs reported in #259#301
fix(backend): resolve five backend bugs reported in #259#301oguzhan-canada wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR addresses issue #259 by improving state propagation in the retriever graph, hardening PDF ingestion against corrupted files, and making configuration more flexible via environment variables.
Changes:
- Add missing
context_listkey to the agent state schema so LangGraph doesn’t drop it. - Fix a typo in the summarise prompt template and add regression tests.
- Improve configurability and resilience: respect
FAISS_DB_PATH, handle corrupted PDFs, and mapGOOGLE_GEMINIversions to model names.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| backend/src/agents/retriever_typing.py | Adds context_list to the graph state schema for propagation. |
| backend/tests/test_retriever_typing.py | Regression test ensuring context_list is declared on AgentState. |
| backend/src/prompts/prompt_templates.py | Fixes “avaiable” typo and updates the fallback sentence. |
| backend/tests/test_prompt_templates.py | Adds a regression test for the corrected fallback sentence/typo. |
| backend/src/tools/process_pdf.py | Returns [] on PdfStreamError to avoid crashing on corrupted PDFs. |
| backend/tests/test_process_pdf.py | Adds/reenables a corrupted PDF test expecting an empty result. |
| backend/src/vectorstores/faiss.py | Allows overriding the FAISS DB path via FAISS_DB_PATH. |
| backend/tests/test_faiss_vectorstore.py | Tests default path fallback and env var override behavior. |
| backend/src/api/routers/helpers.py | Adds _resolve_gemini_model mapping for GOOGLE_GEMINI values. |
| backend/tests/test_api_helpers.py | Adds tests for known mappings and fallback behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| class AgentState(TypedDict): | ||
| messages: Annotated[list[AnyMessage], add_messages] | ||
| context: Annotated[list[AnyMessage], add_messages] | ||
| context_list: Annotated[list[str], add_messages] |
| def get_db_path(self) -> str: | ||
| env_path = os.getenv("FAISS_DB_PATH") | ||
| if env_path: | ||
| return os.path.abspath(env_path) | ||
| cur_path = os.path.abspath(__file__) | ||
| path = os.path.join(cur_path, "../../../", "faiss_db") | ||
| path = os.path.abspath(path) # Ensure proper parent directory |
| def _resolve_gemini_model(gemini_version: str | None) -> str: | ||
| """Map a GOOGLE_GEMINI version string to a Gemini model name. | ||
|
|
||
| Falls back to ``gemini-2.0-flash`` when the value is unset or unknown. | ||
| """ | ||
| return _GEMINI_MODEL_MAP.get(gemini_version or "", "gemini-2.0-flash") |
| documents = loader.load_and_split(text_splitter=text_splitter) | ||
| except PdfStreamError: | ||
| logging.error(f"Error processing PDF: {file_path} is corrupted or incomplete.") | ||
| return [] |
…ect#259 Each fix ships with a regression test (RED->GREEN): - tools/process_pdf.py: return [] when a PDF raises PdfStreamError instead of falling through to an UnboundLocalError on the undefined `documents`. - vectorstores/faiss.py: honor the FAISS_DB_PATH environment variable in get_db_path() (previously the configured value was ignored). - agents/retriever_typing.py: add the missing `context_list` field to the AgentState schema so LangGraph propagates retrieved context downstream. - api/routers/helpers.py: derive the Gemini model from GOOGLE_GEMINI via a small mapping (default gemini-2.0-flash) instead of hardcoding it. - prompts/prompt_templates.py: fix the "avaiable" typo in summarise_prompt_template ("Sorry, it's not available..."). ruff format/check and mypy (strict) pass. Closes The-OpenROAD-Project#259 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Oguzhan Tekin <oguzhantekin@gmail.com>
2f9227a to
998ca6f
Compare
|
Friendly ping - this resolves the five backend bugs from #259 with per-bug regression tests; DCO passes and it's mergeable against |
Summary
Fixes all five bugs reported in #259 (thanks @sanjibani for the detailed review). Each fix is small, isolated, and covered by a regression test written test-first.
tools/process_pdf.pyUnboundLocalErrorafter a corrupted PDF raisesPdfStreamError— theexceptlogged but never returned, then the loop iterated the unassigneddocumentsreturn []in theexceptblockvectorstores/faiss.pyget_db_path()ignoredFAISS_DB_PATHand always used the hardcoded../../../faiss_dbFAISS_DB_PATHfirst, fall back to the computed defaultagents/retriever_typing.pyAgentStatehad nocontext_list, so the value returned byToolNode.get_node()was silently dropped by LangGraphcontext_list: Annotated[list[str], add_messages]api/routers/helpers.pygemini-2.0-flash, ignoringGOOGLE_GEMINI(whichconversations.pyhonors)GOOGLE_GEMINI→ model map (defaultgemini-2.0-flash)prompts/prompt_templates.py"Sorry its not avaiable...""Sorry, it's not available in my knowledge base."Tests
7 new/updated unit tests (red → green):
test_process_pdf_docs_corrupted_file_returns_emptytest_get_db_path_respects_env_var, plus hardened the existingtest_get_db_pathto be deterministic w.r.t.FAISS_DB_PATHtest_agent_state_includes_context_list(newtests/test_retriever_typing.py)test_resolve_gemini_model_maps_known_versions,test_resolve_gemini_model_defaults_for_unknown_or_unsettest_summarise_prompt_template_has_no_typomake checkclean:mypy .(strict) → no issues in 63 source files;ruff check→ all checks passed;ruff formatclean. Full backend suite passes.Closes #259